Abstract. One of the most annoying aspects in the formalization of
mathematics is the need to transform notions to match a given, existing
result. This kind of transformation, often based on substantial background
knowledge in the given scientific domain (mostly expressed in the form of
equalities or isomorphisms), is usually implicit in mathematical discourse,
and it would be highly desirable to obtain a similar behaviour in interactive
provers. The paper describes the superposition-based implementation of this
feature inside the Matita interactive theorem prover, focusing in particular
on the so-called smart application tactic, supporting smart matching between
a goal and a given result.



1 Introduction

Mathematical language has a deeply contextual nature, and its interpretation
often presupposes nontrivial skills in the given mathematical discipline. The
most common and typical example of these "logical abuses" is the implicit use
of equalities and isomorphisms, allowing a mathematician to freely move between
different incarnations of the same entity in a completely implicit way.
Equipping ITP systems with the capability of reasoning up to equality yields an
essential improvement of their intelligence, making the communication between
the user and the machine considerably easier.

Techniques for equational reasoning have been broadly investigated in the
realm of automated theorem proving (see e.g. [7,22,10]). The main deductive
mechanism is a completion technique [17] attempting to transform a given set of
equations into a confluent rewriting system, so that two terms are equal if and
only if they have identical normal forms. Not every equational theory can be
presented as a confluent rewriting system, but one can progressively approximate
it by means of a refutationally complete method called ordered completion. The
deductive inference rule used in completion procedures is called superposition:
it consists of first unifying one side of one equation with a subterm of another,
and then rewriting it with the other side. The selection of the two terms to
be unified is guided by a suitable term ordering, which constrains inferences
and considerably prunes the search space.

Although we are not aware of any work explicitly focused on superposition
techniques for interactive provers, the integration between fully automatic
provers (usually covering paramodulation) and interactive ones is a major
research challenge, and much effort has already been made in this direction: for
instance, KIV has been integrated with the tableau prover 3TAP [1]; HOL has
been integrated with various first order provers, such as Gandalf [15] and Metis;
Coq has been integrated with Bliksem [8]; Isabelle was first integrated with a
purpose-built prover [23] and more recently with Vampire [20]. The problems of
these integrations are usually of two kinds: (a) there is a technical difficulty
in the forward and backward translation of the information between systems, due
to the different underlying logics (ITP systems are usually higher-order, and
some of them intuitionistic); (b) there is a pragmatic problem in the management
of the knowledge base to be used by the automatic solver, since it can be huge
(so we cannot pass it at every invocation), and it grows dynamically (hence, it
cannot be exported in advance).

A good point of the superposition calculus (and not the least reason for
restricting our attention to this important fragment) is that point (a), in this
context, becomes relatively trivial (and the translation particularly effective).
As for point (b), its main consequence is that the communication between the
interactive prover and the problem solver, in order to be efficient, cannot be
stateless: the two systems must share a common knowledge base. This fact,
together with the freedom to adapt the superposition tool to any specific
requirement of the Matita system, convinced us to write our own solver instead
of trying to interface Matita with an available tool. This paper discusses our
experience in implementing a (first order) superposition calculus (Section 2),
its integration within the (higher-order) Matita interactive prover [5]
(Section 3), and in particular its use for the implementation of a smart
application tactic, supporting smart matching between a goal and a given result
(Section 4). We shall conclude with a large number of examples of concrete use
of this tactic.



2 The Matita superposition tool 

One of the components of the automation support provided by the Matita
interactive theorem prover is a first order, untyped superposition tool. This is
a quite small and compact application (little more than 3000 lines of OCaml
code), well separated from the rest of the system. It was entirely rewritten
during the summer of 2009 starting from a previous prototype (some of whose
functionalities had been outlined in [6]), with the aim of improving both its
abstraction and its performance. The tool took part in the 22nd CADE ATP System
Competition, in the unit equality division, finishing in fourth position, ahead
of well-established systems such as Otter or Metis [16], and being awarded as
the best new entrant tool of the competition [28].

In the rest of this section we shall give an outline, as concise as possible, 
of the theory and the architecture of the tool. This is important in order to 
understand its integration with the interactive prover. 




2.1 The superposition calculus in a nutshell 

Let $\mathcal{F}$ be a countable alphabet of functional symbols, and
$\mathcal{V}$ a countable alphabet of variables. We denote with
$\mathcal{T}(\mathcal{F}, \mathcal{V})$ the set of terms over $\mathcal{F}$ with
variables in $\mathcal{V}$. A term $t \in \mathcal{T}(\mathcal{F}, \mathcal{V})$
is either a 0-arity element of $\mathcal{F}$ (constant), an element of
$\mathcal{V}$ (variable), or an expression of the form $f(t_1, \ldots, t_n)$
where $f$ is an element of $\mathcal{F}$ of arity $n$ and $t_1, \ldots, t_n$
are terms.

Let $s$ and $r$ be two terms. $s|_p$ denotes the subterm of $s$ at position $p$
and $s[r]_p$ denotes the term $s$ where the subterm at position $p$ has been
replaced by $r$.

A substitution is a mapping from variables to terms. Two terms $s$ and $t$ are
unifiable if there exists a substitution $\sigma$ such that
$s\sigma = t\sigma$. In this case, $\sigma$ is called a most general unifier
(mgu) of $s$ and $t$ if for every substitution $\theta$ such that
$s\theta = t\theta$, there exists a substitution $\tau$ which satisfies
$\theta = \tau \circ \sigma$.

A literal is either an abstract predicate (represented by a term), or an equality
between two terms. A clause $\Gamma \vdash \Delta$ is a pair of multisets of
literals: the negative literals $\Gamma$, and the positive ones $\Delta$. If
$\Gamma = \emptyset$ (resp. $\Delta = \emptyset$), the clause is said to be
positive (resp. negative).

A Horn clause is a clause with at most one positive literal. A unit clause is 
a clause composed of a single literal. A unit equality is a unit clause where the 
literal is an equality. 

A strict ordering $\prec$ over $\mathcal{T}(\mathcal{F}, \mathcal{V})$ is a
transitive and irreflexive (possibly partial) binary relation. An ordering is
stable under substitution if $s \prec t$ implies $s\sigma \prec t\sigma$ for all
terms $t$, $s$ and substitutions $\sigma$. A well founded monotonic ordering
stable under substitution is called a reduction ordering (see [11]). The
intuition behind the use of reduction orderings for limiting the combinatorial
explosion of new equations during inference is to only rewrite big terms to
smaller ones.



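superposition left:
$$\frac{\vdash l = r \qquad t_1 = t_2 \vdash}{(t_1[r]_p = t_2 \vdash)\sigma}$$

superposition right:
$$\frac{\vdash l = r \qquad \vdash t_1 = t_2}{(\vdash t_1[r]_p = t_2)\sigma}$$

both if $\sigma = mgu(l, t_1|_p)$, $t_1|_p$ is not a variable,
$l\sigma \not\preceq r\sigma$ and $t_1\sigma \not\preceq t_2\sigma$.

equality resolution:
$$\frac{t_1 = t_2 \vdash}{\Box}$$

if $\exists \sigma = mgu(t_1, t_2)$.

Fig. 1. Inference rules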



For efficiency reasons, the calculus must be integrated with a few additional
optimization rules, the most important one being demodulation [29].

subsumption:
$$\frac{S \cup \{C, D\}}{S \cup \{C\}} \quad \text{if } \exists\sigma,\ D\sigma = C$$

tautology elimination:
$$\frac{S \cup \{\vdash t = t\}}{S}$$

demodulation:
$$\frac{S \cup \{\vdash l = r,\ C\}}{S \cup \{\vdash l = r,\ C[r\sigma]_p\}} \quad \text{if } l\sigma = C|_p \text{ and } l\sigma \succ r\sigma$$

Fig. 2. Simplification rules

2.2 The main algorithm

A naive implementation of the superposition calculus could just combine
(superpose) all known clauses in all (admitted) ways, and repeat that process
until the desired clause (called goal) is resolved. To avoid useless duplication
of work, it is convenient to keep clauses in two distinct sets, traditionally
called active and passive, with the general invariant that clauses in the active
set have already been composed together in all possible ways. At every step,
some clauses are selected from the passive set and added to the active set, then
superposed with the active set, and consequently with themselves (inference).
Finally, the newly generated clauses are added to the passive set (possibly
after a simplification).

A natural selection strategy, resulting in a very predictable behaviour, would
consist in selecting the whole passive set at each iteration, in the spirit of a
breadth first search. Unfortunately the number of new equations generated at
each step grows extremely fast, in practice preventing the iteration of the main
loop more than a few times.

To avoid this problem, all modern theorem provers (see e.g. [24]) adopt the
opposite solution. According to some heuristics, such as size and goal
similarity, they select only one passive clause at each step. In order not to
lose completeness, some fairness conditions are taken into account (i.e. every
passive clause will eventually be selected). This approach goes under the name
of given-clause (Figure 3), and its main advantage is that the passive set grows
much more slowly, allowing a more focused and deeper inspection of the search
space, which in turn makes it possible to find proofs that require a much higher
number of main loop iterations.

Fig. 3. Given-clause loop: Selection (1), Simplification (2), Inference (3),
with labels "selected clause" and "new clauses". Numbers in parentheses reflect
the order of the steps.

The main drawback of this approach is that it makes the procedure much
more sensitive to the selection heuristics, leading to an essentially
unpredictable behaviour.
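The control structure of the given-clause loop can be sketched in a few lines of Python. This is only a skeleton under stated assumptions: clauses are opaque hashable values, the inference engine is passed in as a hypothetical `infer` function, simplification is reduced to discarding duplicates, and the toy `compose` function below (chaining pairs) merely stands in for superposition.

```python
import heapq
from itertools import count

def given_clause(initial, goal, infer, weight, max_iter=1000):
    """Generic given-clause loop; infer(given, active) returns the new clauses."""
    tick = count()                                  # age, used to break weight ties
    passive = [(weight(c), next(tick), c) for c in initial]
    heapq.heapify(passive)
    active = set()
    for _ in range(max_iter):
        if not passive:
            return False                            # saturated without reaching the goal
        _, _, given = heapq.heappop(passive)        # (1) selection
        if given in active:                         # (2) simplification (duplicates only)
            continue
        if given == goal:
            return True
        active.add(given)
        for new in infer(given, active):            # (3) inference
            if new not in active:
                heapq.heappush(passive, (weight(new), next(tick), new))
    return False

def compose(given, active):
    """Toy inference: chain pairs (a, b) and (b, c) into (a, c)."""
    out = []
    for c in active:
        if given[1] == c[0]:
            out.append((given[0], c[1]))
        if c[1] == given[0]:
            out.append((c[0], given[1]))
    return out
```

With the initial clauses (1, 2), (2, 3), (3, 4) and weight `abs(b - a)`, the loop reaches the goal (1, 4) after a handful of selections, and terminates by saturation when the goal is unreachable.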



2.3 Performance issues 

In order to obtain a state-of-the-art tool able to compete with the best
available systems, one eventually has to take into account many of the
optimizations and techniques developed for this purpose during the last thirty
years.

In the following we shall briefly describe the most critical areas and, for
each of them, the approach adopted in Matita.



Orderings used to orient rewriting rules. On complex problems (e.g.
problems in the TPTP library with rating greater than 0.30) the choice of a good
ordering for inference rules is of critical importance. We have implemented
several orderings, comprising standard Knuth-Bendix (KBO), non recursive
Knuth-Bendix (NRKBO), lexicographic path ordering (LPO) and recursive path
ordering (RPO). The best suited ordering heavily depends on the kind of problem,
and is hard to predict: our approach for the CADE ATP System Competition
was to run different processes in parallel, each with a different ordering.

On simpler problems (of the kind required for the smart application tactic
of Section 4), the given-clause algorithm is less sensitive to the term
ordering, and we may choose our preferred strategy freely, suitably tuning the
library (we are currently relying on LPO).

Selection strategy. The selection strategy currently implemented by Matita is
based on a combination of age and weight. The weight is a positive integer that
provides an estimation of the "complexity" of the clause, and is tightly related
to the number of occurrences of symbols in it.

Since we are not interested in generating (counter) models of false statements,
we gave up completeness, and we silently drop inferred clauses that would
slow down the main loop too much due to their excessive size.

Another similar optimization that we did not implement, but could consider as
a future development, is the Limited Resource Strategy [25], which basically
allows the procedure to skip some inference steps if the resulting clauses are
unlikely to be processed, mainly because of a lack of time.



Data structures and code optimization. We adopted relatively simple data
structures (like discrimination trees [18] for term indexing), and a purely
functional (in the sense of functional programming) implementation of them.
After some code optimisation, we reached a point where very fast functions are
the most expensive, because of the number of calls (implied by the number of
clauses), even if they operate on simple data structures.

Since we are quite satisfied with the current performance, we did not invest
resources in adopting better data structures, but we believe that further
optimizations will probably require implementing more elaborate data structures,
such as substitution trees [14] or context trees [13], or even adopting an
indexing technique that works modulo associativity and commutativity [12],
which looks very promising when working on algebraic structures.

Demodulation. Another important issue for performance is demodulation: the
given-clause algorithm spends most of its time (up to 80%) in simplification,
hence any improvement in this part of the code has a deep impact on
performance. However, while reduction strategies, sharing issues and abstract
machines have been extensively investigated for lambda calculus (and in general
for left linear systems), less is known for general first order rewriting
systems. In particular, while an innermost (eager) reduction strategy seems to
work generally better than an outermost one (especially when combined with
lexicographic path ordering), one could easily create examples showing the
opposite behaviour (even supposing to always reduce needed redexes).

3 Integrating superposition with Matita 

3.1 Library management 

A possible approach to the integration of superposition with Matita is to solve
all goals assuming that all equations that are part of the library lie in the
passive set, augmented on the fly with the equations in the local context of the
ongoing proof.

The big drawback of this approach is that, starting essentially from the same
set of passive equations at each invocation on a different goal (differing only
in the local context), the given-clause algorithm would mostly repeat the same
selection and composition operations over and over again. It is clear that, if we
wish to superpose library equations, this operation should not be done at run
time but in the background, once and for all. Then we have to face a dual
problem, namely to understand when to stop the saturation of the library with
new equations, preventing an annoying pollution with trivial results that could
have very nasty effects on selection and memory occupation. We would eventually
like to have mechanisms to drive the saturation process.

A natural compromise is to look at library equations not as a passive set, but
as the active one. This means that every time a new (unit) equation is added
to the library it also goes through one iteration of the main given-clause loop,
as if it were the newly selected passive equation: it is simplified, composed
with all existing active equations (i.e. all other equations in the library, up
to simplification), and the newly created equations are added to the passive
list. At run time, we shall then strongly privilege the selection of local
equations or goals.

This way, we have a natural, simple but traceable syntax to drive the
saturation process, by just listing the selected equations in the library. As a
side effect, this approach reduces the verbosity of the library by making it
unnecessary to declare (and name explicitly) trivial variants of available
results that are automatically generated by superposition.

3.2 Interfacing CIC and the superposition engine 

Our superposition tool is first order and untyped, while the Matita interactive
prover is based on a variant of the Calculus of Inductive Constructions (CIC),
a complex higher-order intuitionistic logical system with dependent types. The
communication between the two components is hence far from trivial.

Instead of attempting a complex, faithful encoding of CIC in first order logic
(that is essentially the approach adopted for HOL in [19]) we chose to follow a
more naive approach, based on a forgetful translation that removes types and
just keeps the first order applicative skeleton of CIC terms.

In the opposite direction, we try to reconstruct the missing information by
just exploiting the sophisticated inference capability of the Matita refiner [3],
that is the tool in charge of transforming the user input into a
machine-understandable low-level CIC term.

Automation is thus a best effort service, in the sense that not only may it
obviously fail to produce a proof, but sometimes it could produce an argument
that Matita will fail to understand, regardless of whether the delivered
proof was "correct" or not.

The choice to deal with untyped first order equations in the superposition
tool was mostly made for simplicity and modularity reasons. Moving towards a
typed setting would require a much tighter integration between the superposition
tool and the whole system, due to the complexity of typing and unification, but
does not seem to pose any major theoretical problem.

The forgetful encoding. Equations $r =_T s$ of the calculus of constructions are
translated to first order equations by merely following the applicative structure
of $r$ and $s$, and translating any other subterm into an opaque constant. The
type $T$ of the equation is recorded, but we are not supposed to be able to
compute types for subterms.
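The idea of the forgetful translation can be illustrated by a small Python sketch. The miniature term language below (`Const`, `Var`, `App`, and `Lam` standing for any non-applicative construct) is a hypothetical stand-in for CIC terms, not Matita's actual data type; the output uses strings for variables and tuples for applications.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Const:
    name: str

@dataclass(frozen=True)
class Var:            # a variable of the equation (universally quantified)
    name: str

@dataclass(frozen=True)
class App:
    head: object
    args: tuple

@dataclass(frozen=True)
class Lam:            # stands for any construct outside the applicative skeleton
    var: str
    body: object

def forget(t, opaque):
    """Translate t to a first order term; `opaque` maps hidden subterms to names."""
    if isinstance(t, Var):
        return t.name
    if isinstance(t, Const):
        return (t.name,)
    if isinstance(t, App) and isinstance(t.head, Const):
        return (t.head.name,) + tuple(forget(a, opaque) for a in t.args)
    # anything else becomes an opaque constant, shared by equal subterms
    if t not in opaque:
        opaque[t] = f"_opaque{len(opaque)}"
    return (opaque[t],)
```

Keeping the `opaque` table around is what would allow the hidden subterms to be restored when a first order proof is read back, under the assumptions of this sketch.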

Although types are neglected, the risk of producing "ill-typed" terms via
superposition rules is moderate. Consider for instance the superposition left
rule (the reasoning is similar for the other rules)

$$\frac{\vdash l = r \qquad t_1 = t_2 \vdash}{(t_1[r]_p = t_2 \vdash)\sigma}$$

where $\sigma = mgu(l, t_1|_p)$ and $l\sigma \not\preceq r\sigma$. The risk is
that $t_1|_p$ has a different type from $l$, resulting in an illegal rewriting
step. Note however that $l$ and $r$ are usually rigid terms, whose type is
uniquely determined by the outermost symbol. Moreover, $t_1|_p$ cannot be a
variable, hence they must share this outermost symbol. If $l$ is not rigid, it
is usually a variable $x$, and if $x \in r$ (like e.g. in $x = x + 0$) we have
(in most orderings) $l \prec r$, which again rules out rewriting in the wrong
direction.

This leads us to the following notion of admissibility. We say that an
applicative term $f(x_1, \ldots, x_n)$ is implicitly typed if its type is
uniquely determined by the type of $f$. We say that an equation $l = r$ is
admissible if both $l$ and $r$ are implicitly typed, or $l \prec r$ and $r$ is
implicitly typed. Non-admissible equations are not taken into account by the
superposition tool.

In practice, most unit equalities are admissible. A typical counterexample is
an equation of the kind $\forall x, y : unit.\ x = y$, where unit is a singleton
type.

On the other hand, non-unit equalities are often not admissible. For instance,
a clause of the kind $x \wedge y = true \vdash x = true$ could be used to
rewrite any term to true, generating meaningless, ill-typed clauses. Extending
superposition beyond the unit equality case does eventually require taking types
into consideration.



3.3 (Re)construction of the proof term 

Translating a first-order resolution proof into a higher-order logic natural
deduction proof is a notoriously difficult issue, even more delicate in the case
of intuitionistic systems, such as the one supported by Matita. While resolution
per se is a perfectly constructive process, skolemization and transformation
into conjunctive normal form are based on classical principles.

Our choice of focusing on the superposition calculus was also motivated by
the fact that it poses fewer difficulties, since skolemization is not needed and
thus proofs have a rather simple intuitionistic interpretation.

Our technique for reconstructing a proof term relies as much as possible on
the refinement capabilities of Matita, in particular for inferring implicit types.
In the superposition module, each proof step is encoded as a tuple

    Step of rule * int * int * direction * position * substitution

where rule is the kind of rule which has been applied, the two integers are the
ids of the two composing equations (referring to a "bag" of unit clauses),
direction is the direction in which the second equation is applied to the first
one, position is a path inside the rewritten term, and finally substitution is
the mgu required for the rewriting step.

Every superposition step is encoded by one of the following terms:

    eq_ind_l : $\forall A : Type.\ \forall x : A.\ \forall P : A \to Prop.\ P\,x \to \forall y : A.\ x = y \to P\,y$
    eq_ind_r : $\forall A : Type.\ \forall x : A.\ \forall P : A \to Prop.\ P\,x \to \forall y : A.\ y = x \to P\,y$

where left (_l) and right (_r) must be understood w.r.t. backward application,
and where P is the one hole context that represents the position in which the
superposition occurred.

Footnote: A more liberal, but also slightly more expensive, solution consists in
indexing any equation and systematically trying to read back each result of a
superposition step in CIC, dropping it if it is not understood by the refiner.

At the end of the superposition procedure, if a proof is found, either a trivial
goal has been generated, or a fact subsumes one of the active goals. In the
latter case, we perform a rewriting step on the subsumed goal, so that we fall
back into the previous case. Thus, when the procedure successfully stops, the
selected clause is of the form s = t where s and t are unifiable. We call it
the meeting point, because forward steps (superposition right) and backward
steps (superposition left) meet together when this trivial clause is generated,
to compose the resulting proof. To generate a CIC proof term, the clauses are
topologically sorted, their free variables are explicitly quantified, and nested
let-in patterns are used to build the proof.
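The shape of this last step can be sketched in Python with the standard `graphlib` module (Python 3.9 or later). The function `let_in_skeleton`, the dictionary `deps` (mapping a derived clause id to the ids of the clauses it was obtained from) and the `<step using ...>` placeholders are assumptions for illustration: the sketch only shows the ordering of the bindings, not the actual proof terms.

```python
from graphlib import TopologicalSorter

def let_in_skeleton(deps, goal):
    """Return nested let-in bindings for the clauses `goal` depends on."""
    # keep only the clauses that the goal actually depends on
    needed, stack = set(), [goal]
    while stack:
        c = stack.pop()
        if c not in needed:
            needed.add(c)
            stack.extend(deps.get(c, ()))
    graph = {c: set(deps.get(c, ())) for c in needed}
    lines = []
    # static_order yields every clause after all the clauses it depends on
    for c in TopologicalSorter(graph).static_order():
        if deps.get(c):                  # input equations need no let-binding
            parents = ", ".join(f"clause_{d}" for d in sorted(deps[c]))
            lines.append(f"let clause_{c} := <step using {parents}> in")
    lines.append(f"clause_{goal}")
    return "\n".join(lines)
```

For instance, if clause 59 is derived from clauses 12 and 30, and clause 60 from clauses 59 and 7, the binding for `clause_59` is emitted before the one for `clause_60`, and unrelated clauses are left out.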

The most delicate point of the translation is closing each clause w.r.t. its free
variables, since we should infer a type for them, and since CIC is an explicitly
polymorphic language it is often the case that the order of abstractions does
matter (e.g. variables standing for types must in general be abstracted before
polymorphic variables).

The simplest solution is to generate so-called "implicit" arguments, leaving
to the Matita refiner the burden of guessing them.

For instance, superposing lencat : len A x + len A y = len A (x@y) with
cat_A : x@(y@z) = (x@y)@z at the position of the subterm x@y of lencat and in
the given direction gives rise to the following piece of code, where question
marks stand for implicit arguments:



    let clause_59 :
      ∀w:?. ∀x:?. ∀y:?. ∀z:?.
        len w (x@y) + len w z = len w (x@(y@z))
      := λw:?. λz:?. λx:?. λy:?.
        eq_ind_r (List w) ((x@y)@z)
          (λhole : List w. len w (x@y) + len w z = len w hole)
          (lencat w (x@y) z) (x@(y@z)) (cat_A w x y z)
    in



Note that w must be abstracted first, since it occurs in the (to be inferred) 
types for x,y and z. Also note the one hole context expressed as an anonymous 
function whose abstracted variable is named hole, corresponding to the position 
of x@y in the statement of lencat. 

The interesting point is that refining is a complex operation, using e.g. hints, 
and possibly calling back the automation itself: the interpretation of the proof 
becomes hence a dialog between the system and its automation components, 
aimed to figure out a correct interpretation out of a rough initial trace. 

A more sophisticated translation, aimed to produce a really nice, human- 
readable output in the form of a chain of equations, is described in [6]. 

4 Smart application 

The most interesting application of superposition (apart from its use for solving
equational goals) is the implementation of a more flexible application tactic. As
a matter of fact, one of the most annoying aspects of formal development is
the need to transform notions to match a given, existing result. As explained
in the introduction, most of these transformations are completely transparent
in typical mathematical discourse, and we would like to obtain a similar
behaviour in interactive provers.

Given a goal B' and a theorem t : A → B, the aim is to try to match B
with B' up to the available equational knowledge base, in order to apply t. We
call this the smart application of t to B'. We use superposition in the most
direct way, exploiting on one side the higher-order features of CIC, and on the
other the fact that the translation to first order terms does not make any
difference between predicates and functions: we simply generate a goal B = B'
and pass it to the superposition tool (actually, it was precisely this kind of
operation that motivated our original interest in superposition). If a proof is
found, B' is transformed into B by rewriting and t is then normally applied.

Fig. 4. Smart application

Superposition, addressing a typically undecidable problem, can easily diverge,
while we would like to have a reasonably fast answer to the smart application
invocation, as for any other tactic of the system. We could simply add a timeout,
but we prefer to take a different, more predictable approach. As we already said,
the overall idea is that superposition right steps, which realise the saturation
of the equational theory, should be thought of as background operations. Hence,
at run time, we should conceptually work as if we had a confluent rewriting
system, and the only operation worth doing is narrowing (that is, left
superposition steps). Narrowing too can be undecidable, hence we fix a given
number of narrowing operations to apply to each goal (where the new goal
instances generated at each step are treated in parallel). The number of
narrowing steps can be fixed by the user, but a really small number is usually
enough to solve the problem if a solution exists.

5 Examples 

Example 1. Suppose we wish to prove that the successor function is
le-reflecting, namely

    (*) $\forall n, m.\ S\,n \le S\,m \to n \le m$

Suppose we already proved that the predecessor function is monotonic:

    monotonic_pred : $\forall n, m.\ n \le m \to pred\ n \le pred\ m$

We would like to merely "apply" the latter to prove the former. Just relying on
unification, this would not be possible, since there is no way to match
$pred\ X \le pred\ Y$ against $n \le m$ without narrowing the former. By
superposing twice with the equation $\forall n.\ pred\,(S\,n) = n$ we
immediately solve our matching problem via the substitution
$\{X := S\,n,\ Y := S\,m\}$. Hence, the smart application of monotonic_pred to
the goal $n \le m$ succeeds, opening the new goal $S\,n \le S\,m$, which is the
assumption in (*).
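The two narrowing steps of Example 1 can be spelled out as follows. This is a reconstruction in the notation of Section 2.1; the fresh variables $k_1, k_2$ and the symbol $\stackrel{?}{=}$ for the matching problem are introduced here only for illustration.

$$
\begin{aligned}
 & pred\ X \le pred\ Y \ \stackrel{?}{=}\ n \le m \\
\leadsto\ & k_1 \le pred\ Y \ \stackrel{?}{=}\ n \le m
   && \text{narrowing with } pred\,(S\,k_1) = k_1,\quad X := S\,k_1 \\
\leadsto\ & k_1 \le k_2 \ \stackrel{?}{=}\ n \le m
   && \text{narrowing with } pred\,(S\,k_2) = k_2,\quad Y := S\,k_2 \\
 & k_1 := n,\quad k_2 := m
   && \text{by syntactic unification, hence } X := S\,n,\ Y := S\,m
\end{aligned}
$$

Two narrowing steps suffice here, in line with the remark that a small bound on the number of narrowing operations is usually enough.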

Example 2. Suppose we wish to prove $n \le m * n$ for all natural numbers n, m.
Suppose we already proved that multiplication is left-monotonic, namely

    monotonic_le_times_l : $\forall n, a, b.\ a \le b \to a * n \le b * n$

In order to apply this result, the system has to find a suitable $?a$ such that
$?a * n = n$, which is easily provided by the identity law for times.

Example 3. In many cases, we just have local equational variants of the needed
results. Suppose for instance we proved that multiplication is le-reflecting in
its right parameter:

    le_times_to_le_times_r : $\forall a, n, m.\ a * n \le a * m \to n \le m$

Since times is commutative, this also trivially implies the left version:

    monotonic_le_times_l : $\forall a, n, m.\ n * a \le m * a \to n \le m$

Formally, suppose we have the goal $n \le m$ under the assumption
(H) $n * a \le m * a$. By applying le_times_to_le_times_r we obtain a new goal
$?a * n \le\ ?a * m$ that is a smart variant of H.

Example 4. Suppose we wish to prove that (H) $a * (S\,n) \le a * (S\,m)$ implies
$a * n \le a * m$, where S is the successor function (this is a subcase in the
inductive proof that the product by a positive constant a is le-reflecting).
Suppose we already proved that the sum is le-reflecting in its second argument:

    le_plus_to_le_plus_r : $\forall a, n, m.\ a + n \le a + m \to n \le m$

By applying this result we obtain the new goal $?a + a * n \le\ ?a + a * m$, and
if we have the expected equations for times, we can close the proof by a smart
application of H.

Example 5. Consider the goal $n < 2 * m$ under the assumptions (H) $0 < m$ and
(H1) $n \le m$. Suppose that we defined $x < y \Leftrightarrow x + 1 \le y$.
Moreover, by the defining equation of times we should know something like
$2 * m = m + (m + 0)$. Hence the goal is equal to $n + 1 \le m + (m + 0)$, and
the idea is to use again the monotonicity of plus (in both arguments):

    le_plus n m : $\forall a, b.\ n \le m \to a \le b \to n + a \le m + b$

The smart application of this term to the goal $n < 2 * m$ succeeds, generating
the two subgoals $n \le m$ and $1 \le m + 0$. The former is the assumption H1,
while the latter is a smart variant of H.

Footnote: The precise shape depends on the specific equations available on times.

Example 6. Let us give an example inspired by the theory of programming
languages. Suppose we have a typing relation $\Gamma \vdash M : N$ stating that
in the environment $\Gamma$ the term M has type N. If we work in De Bruijn
notation, the weakening rule requires lifting

    weak : $\Gamma \vdash M : N \to \Gamma, A \vdash\ \uparrow^1(M) :\ \uparrow^1(N)$

Suppose now we have an axiom stating that $\vdash * : \Box$, where $*$ and
$\Box$ are two given sorts. We would like to generalize the previous result to
an arbitrary (legal) context $\Gamma$. To prove this, we just have to apply
weakenings (reasoning by induction on $\Gamma$). However, the normal application
of weak would fail, since the system should be able to guess two terms M and N
such that $\uparrow^1(M) = *$ and $\uparrow^1(N) = \Box$. If we know that for
any constant c, $\uparrow^1(c) = c$ (which comes from the definition of lifting),
we may use such an equation to enable the smart application of weak.

Performance. In Figure 5 we give the execution times for the examples of smart
application discussed above (in bytecode). Considering these times, it is
important to stress again that the smart application tactic does not take any
hint about the equations it is supposed to use to solve the matching problem,
but exploits all the equations available in the (imported sections of the)
library.

The important point is that smart application is fast enough not to disturb
the interactive dialog with the proof assistant, while providing a much higher
degree of flexibility than the traditional application.



    example   applied term                   execution time
    1         monotonic_pred                 0.16s
    2         monotonic_le_times_l           0.23s
    3         H : a * n ≤ a * m              0.22s
    4         H : a * (S n) ≤ a * (S m)      0.15s
    5         le_plus n m                    0.57s
    6         weak                           0.15s

Fig. 5. Smart application execution times



Footnote. The lifting operation $\uparrow^n(M)$ is meant to relocate the term M under n additional
levels of bindings: in other words, it increases by n all free variables in M.

6 Related works and systems

Matita was essentially conceived as a light version of Coq [9], sharing the same
foundational logic (the Calculus of Inductive Constructions) and being partially
compatible with it (see [4] for a discussion of the main differences between the
two systems at kernel level). Hence, Coq is also the most natural touchstone
for our work. The auto tactic of Coq does not perform rewriting; this is only
done by a couple of specialized tactics, called autorewrite and congruence.
The first tactic carries out rewritings according to sets of oriented equational
rules explicitly passed as arguments to the tactic (and previously built by the
user with suitable vernacular commands). Each rewriting rule in some base is
applied to the goal until no further reduction is possible. The tactic does not
perform narrowing, nor any form of completion. The congruence tactic implements
the standard Nelson and Oppen congruence closure algorithm [21], which
is a decision procedure for ground equalities with uninterpreted symbols; the
Coq tactic only deals with equalities in the local context. Both Coq tactics are
considerably weaker than superposition, which seems to provide a good surrogate for
several decision procedures for various theories, as well as a simple framework
for composing them (see e.g. [2]).
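
To make the notion of "applying oriented rules until no further reduction is possible" concrete, the following is a minimal sketch of exhaustive rewriting. It is not Coq's implementation: the term encoding (nested tuples with the head symbol first, pattern variables written as strings starting with `?`) and the names `match`, `instantiate`, `rewrite_once` and `normalize` are assumptions made for this example only.

```python
def match(pattern, term, subst):
    """One-way matching: extend subst so that the instantiated pattern equals term."""
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in subst:
            return subst if subst[pattern] == term else None
        return {**subst, pattern: term}
    if (isinstance(pattern, tuple) and isinstance(term, tuple)
            and len(pattern) == len(term)):
        for p, t in zip(pattern, term):
            subst = match(p, t, subst)
            if subst is None:
                return None
        return subst
    return subst if pattern == term else None


def instantiate(term, subst):
    """Replace the pattern variables of term by their bindings in subst."""
    if isinstance(term, str) and term.startswith("?"):
        return subst[term]
    if isinstance(term, tuple):
        return tuple(instantiate(t, subst) for t in term)
    return term


def rewrite_once(term, rules):
    """Perform one rewriting step, or return None if term is in normal form."""
    for lhs, rhs in rules:                  # rules are used left to right only
        subst = match(lhs, term, {})
        if subst is not None:
            return instantiate(rhs, subst)
    if isinstance(term, tuple):
        for i, sub in enumerate(term):
            new = rewrite_once(sub, rules)
            if new is not None:
                return term[:i] + (new,) + term[i + 1:]
    return None


def normalize(term, rules, max_steps=1000):
    """Rewrite until no rule applies; the step bound guards against looping rules."""
    for _ in range(max_steps):
        new = rewrite_once(term, rules)
        if new is None:
            return term
        term = new
    raise RuntimeError("no normal form reached within max_steps")


# Oriented rules for addition on unary naturals.
RULES = [
    (("plus", "?x", "0"), "?x"),
    (("plus", "?x", ("S", "?y")), ("S", ("plus", "?x", "?y"))),
]

one = ("S", "0")
assert normalize(("plus", one, one), RULES) == ("S", ("S", "0"))
```

Only the term being normalized is instantiated by matching; the rules are never unified against each other (no completion) and the goal's own unknowns are never instantiated (no narrowing), which is the limitation discussed above.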

Comparing the integration of superposition in Matita with similar functionalities
provided by Isabelle is doubly complex, due not only to the different
approaches, but also to the different underlying logics.

In Isabelle, equational reasoning can be either delegated to external tools or
dealt with internally by the so-called simplifier. Some of the external tools
Isabelle is interfaced with provide full support for paramodulation (and hence
superposition), but the integration with them is stateless, possibly requiring
hundreds of theorems (the whole currently visible environment) to be passed at
each invocation. In Matita, the active set is persistent, and grows as the user
proves new equations.

Of more interest is the comparison with Isabelle's internal simplifier. The
integration of this tool with the library is manual: only lemmas explicitly labelled
and oriented by the user are taken into account by the simplifier. Moreover,
these lemmas are only used to demodulate and are not combined together to
infer new rewriting rules. Nevertheless, a pre-processing phase allows the user to
label theorems whose shape is not an equation. For example, a conjunction of two
equations is interpreted as two distinct rewriting rules, and a negative statement
$\neg A$ is understood as $A = \mathit{False}$. The simplifier is also able to take into account
guarded equations, as long as their premises can be solved by the simplifier itself.
Finally, it detects equations that cannot be oriented by the user, like
commutativity, and restricts their application according to the demodulation rule using
a predefined lexicographic order.
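
The two mechanisms just described can be sketched as follows. This is an illustrative toy, not Isabelle's code: the formula encoding (tuples tagged `"and"`, `"eq"`, `"not"`), the function names `preprocess` and `commute_if_smaller`, and the use of string comparison of the printed term as a stand-in for the predefined order are all assumptions of this example.

```python
def preprocess(formula):
    """Turn a labelled theorem into a list of oriented rules (lhs, rhs)."""
    kind = formula[0]
    if kind == "and":                       # A /\ B yields the rules of A and of B
        return preprocess(formula[1]) + preprocess(formula[2])
    if kind == "eq":                        # l = r is used left to right
        return [(formula[1], formula[2])]
    if kind == "not":                       # not A is read as A = False
        return [(formula[1], "False")]
    raise ValueError(f"shape not handled in this sketch: {formula!r}")


def commute_if_smaller(term):
    """Ordered use of commutativity on a binary application (op, left, right).

    The arguments are swapped only when the result is strictly smaller in a
    fixed total order, so repeated application cannot loop.
    """
    op, left, right = term
    swapped = (op, right, left)
    # Stand-in order: lexicographic comparison of the printed terms.
    return swapped if repr(swapped) < repr(term) else term


rules = preprocess(("and", ("eq", "a", "b"), ("not", ("p", "x"))))
assert rules == [("a", "b"), (("p", "x"), "False")]

assert commute_if_smaller(("plus", "b", "a")) == ("plus", "a", "b")
assert commute_if_smaller(("plus", "a", "b")) == ("plus", "a", "b")
```

The second function shows why an unorientable equation can still be used safely: each permitted step strictly decreases the term in the chosen order, so the rewriting terminates.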

Anyway, the main difference from the user's perspective comes from a deep
reason that has little to do with the simplifier or any other implemented machinery.
Since Isabelle is based on classical logic, co-implication can be expressed as an
equality. Hence, in Isabelle we can prove many more equations at the
propositional level and use them for rewriting. Any concrete comparison between the
two provers with respect to equational reasoning is thus inherently biased, since
many problems encountered in one system would look meaningless, artificial or
trivial when transposed into the other one.

7 Conclusions

We described in this paper the "smart" application tactic of the Matita
interactive theorem prover. The tactic allows the backward application of a theorem to
a goal, where matching is done up to the database of all equations available in
the library. The implementation of the tactic relies on a compact superposition
tool, whose architecture and integration within Matita have been described in the
first sections. The tool already performs well (it was awarded best new entrant
tool at the 22nd CADE ATP System Competition), but many improvements can
still be made for efficiency, such as the implementation of more sophisticated data
structures for indexes (we currently use discrimination trees).

Another interesting research direction is to extend the management of equality
to setoid rewriting [27]. Indeed, the current version of the superposition tool
just works with an intensional equality, and it would be interesting to try to
figure out how to handle more general binary relations. The hard problem is
proof reconstruction, but again it seems possible to exploit the sophisticated
capabilities of the Matita refiner [3] to automatically check the legality of the
rewriting operation (i.e. the monotonicity of the context inside which rewriting
has to be performed), exploiting some of the ideas outlined in [26].

One of the most promising uses of smart application is inside the
backward-based automation tactic of Matita. In fact, smart application allows a smooth
integration of equational reasoning with the Prolog-like backward applicative
mechanisms, which, according to our first experiments, looks extremely
promising. As a matter of fact, the weakest point of smart application is that it does not
relieve the user from the effort of finding the "right" theorems in the library or
of guessing/remembering their names (although it considerably reduces the
need for variants of a given statement in the repository). A suitably constrained
automation tactic could entirely replace the user in the quest for candidates for
the smart application tactic. Since searching is a relatively expensive operation,
the idea is to ask the automation tactic to return an explicit trace of the resulting
proof (essentially, a sequence of smart applications) to speed up its re-execution
during script development.